DETAILED ACTION

1. This non-final office action is in response to the Pre-Appeal Brief Request filed 12
June 2008.

2. Claims 3-13, 16-19, and 21-25 are pending. Claims 21, 23, and 25 are 
independent claims. 

The rejection of claims 3, 7-11, 13, 16, 21, 23, and 25 under 35 USC 103 over
Meyerzon et al. (US 6638314, filed 26 June 1998, hereafter Meyerzon) in further view of 
Blumenthal (US 6026409, filed 26 June 1998) and further in view of Koike et al. (US 
7194678, filed 1 March 2000, hereafter Koike) has been withdrawn in view of the 
applicant's remarks. 

The rejection of claims 4-6 and 17-19 under 35 USC 103 over Meyerzon, 
Blumenthal, and Koike, and further in view of Hobbs (US 6523022, filed 7 July 1999) 
has been withdrawn in view of the applicant's remarks. 

The rejection of claims 12, 22, and 24 under 35 USC 103 over Meyerzon, 
Blumenthal, and Koike, and further in view of Lawrence et al. (US 6289342, filed 20 
May 1998, hereafter Lawrence) has been withdrawn in view of the applicant's remarks. 

3. In view of the Pre-Appeal Brief Request for Review filed on 12 June 2008,
PROSECUTION IS HEREBY REOPENED. New grounds of rejection are set forth
below.

To avoid abandonment of the application, appellant must exercise one of the
following two options:

(1) file a reply under 37 CFR 1.111 (if this Office action is non-final) or a reply
under 37 CFR 1.113 (if this Office action is final); or,

(2) initiate a new appeal by filing a notice of appeal under 37 CFR 41.31 followed
by an appeal brief under 37 CFR 41.37. The previously paid notice of appeal fee and
appeal brief fee can be applied to the new appeal. If, however, the appeal fees set forth
in 37 CFR 41.20 have been increased since they were previously paid, then appellant
must pay the difference between the increased fees and the amount previously paid.

A Supervisory Patent Examiner (SPE) has approved of reopening prosecution by 
signing below. 

Claim Rejections - 35 USC § 103 

4. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all 
obviousness rejections set forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set 
forth in section 102 of this title, if the differences between the subject matter sought to be patented and 
the prior art are such that the subject matter as a whole would have been obvious at the time the 
invention was made to a person having ordinary skill in the art to which said subject matter pertains. 
Patentability shall not be negatived by the manner in which the invention was made. 

5. Claims 3, 7-11, 13, 16, 21, 23, and 25 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Meyerzon et al. (US 6638314, filed 26 June 1998, hereafter
Meyerzon) in further view of Tuli (US 6874009, filed 16 February 2000) and further in
view of Koike et al. (US 7194678, filed 1 March 2000, hereafter Koike).

As per independent claim 21, Meyerzon discloses a method for indexing data
documents, the method comprising:

Retrieving, to a server, with a web crawler from a network address, a data 
document with client-side scripting code therein (Figure 2: Here, a web crawler server is 
implemented between a client and a web server) 

Executing, at the server, a web-browser, as part of the web crawler, wherein the 
web-browser renders an in-memory copy of the data document which has been 
retrieved, wherein the in-memory copy of the data document maintains a web-browser 
display format and a web-browser display layout of the data document when displayed 
in the web browser (Meyerzon Col 7 Lines 60-65 and Col 8 Lines 15-20: Here, the 
crawler acts as a web browser in that it requests the web page data. These requested 
web page documents are stored in memory in a display format) 

Executing, at the server instead of a client system, a browser scripting engine as 
part of the web-browser for loading content as directed by the client-side scripting code 
into the in-memory copy creating a final web-browser display representation of the 
dynamic data document so that the final web-browser display representation is 
substantially similar to when the data document is viewed by a user in the user's web- 
browser running on the client system when all the data is viewed (Meyerzon Col 7 Lines 
60-65 and Col 8 Lines 15-20) 

Indexing, at the server, the content in the memory, wherein the content being 
indexed is the content which has been loaded by the browser scripting engine in order 



Application/Control Number: 09/607,370 Page 5 

Art Unit: 2178 

to index the data document as if being viewed by the user in the user's web-browser on 
the client system (Figures 4-5). 

Meyerzon does not specifically mention wherein the server processing unit 
renders the in-memory webpage prior to analyzing and summarizing the in-memory 
webpage. However, Tuli discloses rendering the webpage at a server prior to analyzing 
and summarizing the webpage (column 5, lines 39-50; Here, a webpage is received at a 
server from a remote device. Upon receipt, the entire webpage document is rendered. 
After this rendering the page is divided, compressed, and transmitted to a device 
remote to the server). It would have been obvious to one of ordinary skill in the art at the
time of the invention to apply Tuli to Meyerzon, providing Meyerzon the benefit of
rendering the document prior to analyzing it, thereby ensuring appropriate analysis of the
document.

Meyerzon does not specifically disclose wherein the data document is a dynamic 
data document, wherein an in-memory copy of a dynamic data document is rendered, 
and wherein a browser scripting engine executes the client-side scripting code. 
However, Koike discloses a proxy server assembling a dynamic data document for 
display at a client browser wherein an in-memory copy of a dynamic data document is 
rendered, and wherein a browser scripting engine executes the client-side scripting 
code (Figures 6-8; column 7, lines 13-33). It would have been obvious to one of 
ordinary skill in the art at the time of the applicant's invention to have combined Koike 
with Meyerzon, since it would have allowed a user to more quickly receive the dynamic 
data.

In regard to dependent claim 3, Meyerzon further discloses wherein the one or
more images with textual content embedded therein include at least one of an in-line
GIF image and an in-line JPEG image (Meyerzon Col 9 Lines 37-46, i.e. an image is
retrieved to display on a web page, and it is well known in the art that the images displayed
on web pages can be GIF and JPEG images).

In regard to dependent claim 7, Meyerzon further discloses initializing a first list
with seed values (Meyerzon Col 17 Lines 25-26, i.e. assigning a current crawl number to
the current web crawl); checking if there are any URLs to be processed and in response
that any URL exists to be processed then performing the following sub-steps of (Meyerzon
Col 17 Lines 28-29, i.e. determining whether an electronic document has been retrieved):
determining if a URL is in a second list; and in response that a URL is not in the second
list then performing the following sub-steps of: inserting the URL into the first list;
scheduling the URL for crawling; crawling the URL when scheduled to do so; removing
the URL from the first list after the scheduled crawling; entering the URL into the second
list (Meyerzon Col 9 Line 64 and Col 10 Lines 1-11, i.e. the history map checks each
hyperlink URL to determine if it is already listed in the history map; if not, the URLs are
added, marked as not being crawled, and added to the transaction log. The
history map includes number crawled and number modified data); and repeating the
checking step until there are no more URLs to be processed; where if the determining
step determines that the URL is in the second list then repeating the checking step until
there are no more URLs to be processed (Meyerzon Col 12 Lines 1-17, i.e. retrieves
and processes a URL until there are none left in the transaction log).

In regard to dependent claim 8, Meyerzon further discloses wherein the sub-step
of initializing a first list with seed values further includes the list being a URL pool
(Meyerzon Col 7 Lines 65-67, i.e. retrieving and processing URLs from the transaction log).

In regard to dependent claim 9, Meyerzon further discloses wherein the sub-step 
of determining if a URL is in a second list further includes the second list being a visited 
pool. (Meyerzon Figure 4 shows a column indicating the number crawled and modified) 
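
For illustration only, the crawl loop recited in claims 7 through 9 (a first list acting as a URL pool and a second list acting as a visited pool) can be sketched as follows. This is a minimal sketch and is not taken from Meyerzon or from the application; the names `crawl`, `fetch_links`, `LINKS`, and `fetch_from_graph` are hypothetical, and an in-memory link graph stands in for actual HTTP retrieval.

```python
from collections import deque
from typing import Callable, Iterable, List, Set


def crawl(seeds: Iterable[str],
          fetch_links: Callable[[str], List[str]]) -> List[str]:
    """Crawl outward from the seed URLs; return URLs in the order crawled."""
    url_pool = deque(seeds)            # "first list": URLs waiting to be crawled
    visited_pool: Set[str] = set()     # "second list": URLs already crawled
    crawl_order: List[str] = []

    while url_pool:                    # are there any URLs left to process?
        url = url_pool.popleft()       # remove the URL from the first list
        if url in visited_pool:        # already in the second list: check again
            continue
        discovered = fetch_links(url)  # crawl the URL (hypothetical fetch step)
        visited_pool.add(url)          # enter the URL into the second list
        crawl_order.append(url)
        for link in discovered:
            if link not in visited_pool:
                url_pool.append(link)  # schedule newly found URLs for crawling
    return crawl_order


# Hypothetical link graph used in place of real web pages.
LINKS = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["d"],
    "d": [],
}


def fetch_from_graph(url: str) -> List[str]:
    """Return the outgoing links recorded for a URL in the sample graph."""
    return LINKS.get(url, [])


if __name__ == "__main__":
    print(crawl(["a"], fetch_from_graph))
```

In this sketch the visited pool is a set, so the membership check that decides whether a URL is crawled again takes constant time on average; the claims themselves do not specify any particular data structure.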

In regard to dependent claim 10, Meyerzon discloses wherein the sub-step of 
crawling further comprises the sub-steps of: issuing an HTTP command to a web server 
named in the URL; receiving contents of an HTML page as a result of the issued HTTP 
command; and passing on the contents of the HTML page to a Page Rendering 
subroutine. (Meyerzon Col 8 Lines 26-35 i.e. the client computer transmits data to a 
search engine, the search engine examines its associated index to find documents and 
returns the documents which are secondary documents and lists the documents for the 
user to view) 

In regard to dependent claim 11, Meyerzon discloses receiving the contents of
the HTML page in the Page Rendering subroutine; building an in-memory
representation of a layout for the HTML page and if more data is needed to properly
form the representation, then performing the sub-steps of (Meyerzon Col 7 Lines 60-65
and Col 8 Lines 15-20, i.e. web crawler program searches remote server computers
connected to the network for electronic documents and retrieves electronic documents
and associated data and a browser displays documents to a user); requesting additional
web-based information; gathering this additional web-based information; inserting any
URLs associated with this additional web-based information into the second list and a
URL cache (Meyerzon Col 9 Lines 37-46, i.e. an image is retrieved to display on a web
page); building a final amended representation; and forwarding the final amended
representation to an Extraction subroutine; wherein, if no more data is needed to
properly form the in-memory representation, then forwarding the in-memory
representation to the Extraction subroutine (Meyerzon Col 16 Lines 32-44).

In regard to dependent claim 13, Meyerzon discloses receiving a text map from 
the Page Extractor subroutine; processing the text map in an application-specific 
manner (Meyerzon Col 2 Lines 48-51 i.e. information from the electronic document 
retrieved from the web crawl is stored in an index to begin the routine); applying data 
extraction patterns to the text map (Meyerzon Col 5 Lines 7-8 i.e. extracting data from 
each of the retrieved documents); translating resultant data from the applying step; 
forwarding any URLs present in the text map to a manager subroutine; and forwarding 
any extracted data and metadata to application logic (Meyerzon Col 9 Line 64 and Col
10 Lines 1-11, i.e. the history map checks each hyperlink URL to determine if it is already
listed in the history map; if not, the URLs are added, marked as not being
crawled, and added to the transaction log. The history map includes number crawled
and number modified data).

In regard to dependent claim 16, the claim reflects subject matter similar to that
claimed in claim 3 and is rejected along the same rationale, in addition to the following
(Meyerzon Col 20 Lines 13-14, i.e. computer readable medium having computer
executable instruction).

As per claims 23 and 25, the applicant discloses limitations similar to those in
claim 21. Claims 23 and 25 are similarly rejected.

6. Claims 4-6 and 17-19 are rejected under 35 U.S.C. 103(a) as being unpatentable
over Meyerzon, Tuli, and Koike and in further view of Hobbs (US 6523022, filed
7 July 1999).

In regard to dependent claim 4, Meyerzon does not specifically mention executing one or
more Java applets with textual content embedded therein. However, Hobbs mentions
that Java applets are used (Hobbs Col 28 Line 35). It would have been obvious to one 
of ordinary skill in the art at the time of the invention, to apply Hobbs to Meyerzon, 
providing Meyerzon the benefit of using Java Applets for web pages in the process of 
searching the web documents because Java Applets are compatible with many web 
pages and browsers. 

In regard to dependent claim 5, Meyerzon does not specifically mention wherein 
the loading secondary documents further comprises the loading of secondary 
documents including web documents selected from the group of documents consisting 
of in-line frames, frames, and equivalents. However, Hobbs mentions that frames and
in-line frames are used (Hobbs Col 7 Line 63 through Col 8 Line 34). It would have
been obvious to one of ordinary skill in the art at the time of the invention to apply
Hobbs to Meyerzon, providing Meyerzon the benefit of using frames and in-line frames
for easy viewing for the user.

In regard to dependent claim 6, Meyerzon does not specifically mention wherein
the loading secondary documents further comprises the loading of secondary
documents including one or more Java Script components with textual content
embedded therein. However, Hobbs mentions that Java applets are used (Hobbs Col
28 Line 35). It would have been obvious to one of ordinary skill in the art at the time of
the invention to apply Hobbs to Meyerzon, providing Meyerzon the benefit of using
Java Scripts for web pages in the process of searching the web documents because
Java Scripts are compatible with many web pages and browsers.

In regard to dependent claim 17, the applicant discloses the limitations 
substantially similar to those in claim 4 and the same rejection is incorporated herein 
(Meyerzon Col 20 Lines 13-14 i.e. computer readable medium having computer 
executable instruction). 

In regard to dependent claim 18, the applicant discloses the limitations 
substantially similar to those in claim 5 and the same rejection is incorporated herein 
(Meyerzon Col 20 Lines 13-14 i.e. computer readable medium having computer 
executable instruction). 

In regard to dependent claim 19, the applicant discloses the limitations 
substantially similar to those in claim 6 and the same rejection is incorporated herein 
(Meyerzon Col 20 Lines 13-14 i.e. computer readable medium having computer 
executable instruction).

7. Claims 12, 22, and 24 are rejected under 35 U.S.C. 103(a) as being
unpatentable over Meyerzon, Tuli, and Koike and further in view of Lawrence et al. (US
6289342, filed 20 May 1998, hereafter Lawrence).

In regard to dependent claim 12, Meyerzon discloses accessing a set of memory
structures of the Page Renderer (Meyerzon Col 6 Lines 23-60, i.e. accessing local and
remote memory devices); copying a text portion of the structures into a text map
(Meyerzon Col 15 Lines 15-16, i.e. copying all of the history map entries into the
transaction log as entries); inspecting any in-line GIF and JPEG image references in the
memory structures (Meyerzon Col 9 Lines 37-46, i.e. an image is retrieved to display on
a web page, and it is well known in the art that the images displayed on web pages can be
GIF and JPEG images); extracting alternate text attributes (Meyerzon Col 5 Lines 7-8, i.e.
extracting data from each of the retrieved documents); adding the alternate text
attributes to a text map (Meyerzon Col 2 Lines 48-51, i.e. information from the electronic
document retrieved from the web crawl is stored in an index); extracting text content
from the GIF and JPEG images; adding text content from the images to the text map
(Meyerzon Col 9 Lines 37-46, i.e. an image is retrieved to display on a web page, and it
is well known in the art that the images displayed on web pages can be GIF and JPEG images;
Col 5 Lines 7-8, i.e. extracting data from each of the retrieved documents; Col 2 Lines
48-51, i.e. information from the electronic document retrieved from the web crawl is
stored in an index); and forwarding the text map to a Page Summarizer subroutine
(Meyerzon Col 9 Line 64 and Col 10 Lines 1-11, i.e. the history map checks each hyperlink
URL to determine if it is already listed in the history map; if not, the URLs are added,
marked as not being crawled, and added to the transaction log. The history map
includes number crawled and number modified data).

Meyerzon does not specifically mention invoking an optical character recognition 
engine; analyzing any in-line GIF and JPEG images using the optical character 
recognition engine for text content. However, Lawrence mentions extracting data using 
optical character recognition (Lawrence Col 7 Lines 51-56 i.e. conversion to electronic 
form by use of OCR). It would have been obvious to one of ordinary skill in the art at the
time of the invention to apply Lawrence to Meyerzon, providing Meyerzon the benefit of
extracting content from a document using OCR, which is quicker than typing out an entire
document by hand.

As per claims 22 and 24, the applicant discloses the limitations substantially 
similar to those in claim 12. Claims 22 and 24 are similarly rejected. 

Response to Arguments 

8. Applicant's arguments with respect to Blumenthal have been considered but are 
moot in view of the new ground(s) of rejection. 

9. Applicant's arguments with respect to Meyerzon in the Pre-Appeal Brief Request
filed 12 June 2008 have been fully considered, but they are not persuasive.

The applicant's initial argument with respect to Meyerzon is based upon the
applicant's belief that Meyerzon fails to disclose "a web-browser at the server as part of
the web crawler" (page 1). The applicant further states, "the web-crawler claimed in the
present invention isn't merely acting as a web-browser but rather a separate web
browser is being executed as part of the web crawler" (page 1). The examiner
disagrees with this assertion. The applicant's claim clearly states, "executing, at the
server, a web-browser, as part of the web-crawler" (claim 21, line 5). Contrary to the
applicant's argument, the web browser is not separate from the web crawler; in fact, the
web browser is a portion of the web crawler. Although the applicant claims "retrieving,
to the server, with a web crawler from a network address, a dynamic data document
with client-side scripting code therein" (claim 21, lines 2-3), this limitation does not
restrict the browser, being executed as part of the web crawler, from retrieving the
dynamic data document with client-side scripting code therein. Meyerzon discloses a
web-crawler, having a web-browser as part of the web-crawler, which retrieves a
document to the server (Figure 2), and executing at the server the web browser to
render an in-memory copy of the retrieved document (column 7, line 60 - column 8, line 20;
BPAI Decision of 22 August 2007: page 6). For these reasons, this argument is not
persuasive.

The applicant further argues that Meyerzon fails to teach a web browser 
"displays an in-memory copy of the data document which has been retrieved, wherein 
the in-memory copy of the data document maintains a web-browser display format and 
a web-browser display layout of the dynamic data document when displayed in a web 
browser (page 2)." The examiner respectfully disagrees. Meyerzon discloses the 
rendering for display of the retrieved document (column 7, line 60 - column 8, line 20).
Further, "upon retrieval of a web page document by Meyerzon's web crawler 206, a filter
314 parses the document and returns text and properties to be included in the in-memory
data structure of the web crawler. The text information is information which is
to be displayed for viewing at an end-user's web browser as claimed as indicated by
Meyerzon's disclosure (col. 9, ll. 34-36) that such stored information includes text
formatting data" (BPAI Decision of 22 August 2007: page 6). Therefore, Meyerzon
discloses rendering an in-memory copy of the data, including display format, layout, and
content data. For these reasons, the applicant's argument is not persuasive.

Conclusion 

Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to KYLE R. STORK whose telephone number is (571)272- 
4130. The examiner can normally be reached on Monday-Friday (8:00-4:30). 

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, Stephen Hong can be reached on (571) 272-4124. The fax phone number 
for the organization where this application or proceeding is assigned is 571-273-8300.

Patent Application Information Retrieval (PAIR) system. Status information for
published applications may be obtained from either Private PAIR or Public PAIR.
Status information for unpublished applications is available through Private PAIR only.
For more information about the PAIR system, see http://pair-direct.uspto.gov. Should
you have questions on access to the Private PAIR system, contact the Electronic

Kyle R Stork
Examiner
Art Unit 2178

/Stephen S. Hong/
Supervisory Patent Examiner, Art Unit 2178